Similar resources
Bounded Parameter Markov Decision Processes
In this paper, we introduce the notion of a bounded parameter Markov decision process as a generalization of the traditional exact MDP. A bounded parameter MDP is a set of exact MDPs specified by giving upper and lower bounds on transition probabilities and rewards (all the MDPs in the set share the same state and action space). Bounded parameter MDPs can be used to represent variation or uncert...
Bounded Parameter Markov Decision Processes
In this paper, we introduce the notion of a bounded-parameter Markov decision process (BMDP) as a generalization of the familiar exact MDP. A bounded-parameter MDP is a set of exact MDPs specified by giving upper and lower bounds on transition probabilities and rewards (all the MDPs in the set share the same state and action space). BMDPs form an efficiently solvable special case of the already...
Continuous time Markov decision processes
In this paper, we consider denumerable-state continuous-time Markov decision processes with (possibly unbounded) transition and cost rates under the average criterion. We present a set of conditions and prove the existence of both average cost optimal stationary policies and a solution of the average optimality equation under the conditions. The results in this paper are applied to an admission con...
Bounded-Parameter Partially Observable Markov Decision Processes
The POMDP is considered a powerful model for planning under uncertainty. However, it is usually impractical to employ a POMDP with exact parameters to model real-life situations precisely, for reasons such as limited data for learning the model. In this paper, assuming that the parameters of POMDPs are imprecise but bounded, we formulate the framework of bounded-parameter...
Solving Structured Continuous-Time Markov Decision Processes
We present an approach to solving structured continuous-time Markov decision processes. We approximate the optimal value function by a compact linear form, resulting in a linear program. The main difficulty arises from the number of constraints, which grows exponentially with the number of variables in the system. We exploit the representation of continuous-time Bayesian networks (CTBNs) to de...
Journal
Journal title: The Annals of Probability
Year: 1978
ISSN: 0091-1798
DOI: 10.1214/aop/1176995530